Search results for "Language identification"
showing 10 items of 10 documents
Searching for a Common Language
2010
Perhaps you remember Steven Spielberg’s movie released in 1977, Close Encounters of the Third Kind. The memorable end of this movie shows how humans might communicate with extraterrestrials through the interchange of music, with the famous melody “D E C C G” that we all hummed.
Systems, models and languages
2010
This paper presents a comparison of language aspects in a model and a meta-model. The motivation is to get a better understanding of how we should define a modeling language.
Implications of Theories of Language for Information Systems
1985
This article demonstrates how language views can be adopted into an information systems context. We distinguish here between five language views: denotational, generative, cognitive, behavioristic, and interactionist. These views differ in their assumptions about the origin of linguistic behavior, the primary functions of language, elements of language, and the nature of linguistic knowledge. Information system development approaches can be characterized by their underlying language views. This explains great differences in development methods and research. Thus, language views have implications and should be chosen contingently for a given information system context.
A Comparison of Language Identification Approaches on Short, Query-Style Texts
2010
In a multi-language Information Retrieval setting, knowledge about the language of a user query is important for further processing. Hence, we compare the performance of some typical approaches for language detection on very short, query-style texts. The results show that an accuracy of more than 80% can be achieved even for single words; for slightly longer texts we observed accuracy values close to 100%.
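As a rough illustration of what language detection on query-style text can look like, the sketch below scores a short query against character trigram profiles. It is not the method evaluated in the paper; the training sentences, the `identify` function, and the scoring rule are all hypothetical and chosen only to keep the example small.

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Return a Counter of character n-grams, padding the text with spaces."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

# Toy training text per language (hypothetical, far too small for real use).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and the cat",
    "de": "der schnelle braune fuchs springt über den faulen hund und die katze",
}
PROFILES = {lang: char_ngrams(text) for lang, text in SAMPLES.items()}

def identify(query, n=3):
    """Pick the language whose profile gives the query's n-grams the most weight."""
    grams = char_ngrams(query, n)

    def score(lang):
        profile = PROFILES[lang]
        total = sum(profile.values())
        # Relative frequency of each query n-gram in the language profile.
        return sum(count * profile[g] / total for g, count in grams.items())

    return max(PROFILES, key=score)

print(identify("the dog"))
print(identify("der hund"))
```

With profiles this small the result is only indicative; a realistic system would train on much larger corpora and smooth unseen n-grams.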
From a bodily-based format of knowledge to symbols. The evolution of human language
2013
Although ontogeny cannot recapitulate phylogeny, a two-level model of the acquisition of language will be here proposed and its implication for the evolution of the faculty of language will be discussed. It is here proposed that the identification of the cognitive requirements of language during ontogeny could help us in the task of identifying the phylogenetic achievements that concurred, at some point, to the acquisition of language during phylogeny. In this model speaking will be considered as a complex ability that arises in two different steps. The first step of competence widely relies on a bodily-based format of knowledge. The second step relies on more abstract meta-representations …
Language Detection and Tracking in Multilingual Documents Using Weak Estimators
2010
Published version of an article from the book: Structural, Syntactic, and Statistical Pattern Recognition. The original publication is available at SpringerLink: http://dx.doi.org/10.1007/978-3-642-14980-1_59 This paper deals with the extremely complicated problem of language detection and tracking in real-life electronic (for example, in Word-of-Mouth (WoM)) applications, where various segments of the text are written in different languages. The difficulties in solving the problem are many-fold. First of all, the analyst has no knowledge of when one language stops and when the next starts. Further, the features which one uses for any one language (for example, the n-grams) will not be…
Embedded controlled language to facilitate information extraction from eGov policies
2015
The goal of this paper is to propose a system that can extract formal semantic knowledge representation from natural language eGov policies. We present an architecture that allows for extracting Controlled Natural Language (CNL) statements from heterogeneous natural language texts with the ability to support multilinguality. The approach is based on the concept of embedded CNLs.
THE USE OF WEAK ESTIMATORS TO ACHIEVE LANGUAGE DETECTION AND TRACKING IN MULTILINGUAL DOCUMENTS
2013
This paper deals with the problems of language detection and tracking in multilingual online short word-of-mouth (WoM) discussions. This problem is particularly unusual and difficult from a pattern recognition perspective because, in these discussions, the participants and content involve the opinions of users from all over the world. The nature of these discussions, consisting of multiple topics in different languages, presents us with a problem of finding training and classification strategies when the class-conditional distributions are nonstationary. The difficulties in solving the problem are many-fold. First of all, the analyst has no knowledge of when one language stops and when the…
Translingual text mining for identification of language pair phenomena
2016
Translingual Text Mining (TTM) is an innovative technology of natural language processing for building multilingual parallel corpora, processing machine translation, contextual knowledge acquisition, information extraction, query profiling, language modeling, contextual word sensing, creating feature test sets, and a variety of other purposes. The Keynote Lecture will discuss opportunities and challenges of this computational technology. In particular, the focus will be on the identification of language pair phenomena and their applications to building a holistic language model, which is a novel tool for processing machine translation, supporting professional translations, evaluation of tran…
Usage of HMM-Based Speech Recognition Methods for Automated Determination of a Similarity Level Between Languages
2019
The problem of automatically determining language similarity (or even defining a distance on the space of languages) can be approached in different ways: working with phonetic transcriptions, with speech recordings, or with both. For the recordings, we propose and test an HMM-based approach: in the first part of our article we successfully perform language detection, and afterwards we try to calculate distances between HMM-based models using different metrics and divergences. The Kullback-Leibler divergence is the only one that gave good results, meaning that the calculated distances between languages correspond to the analytical understanding of similarity between them. Even if it does not …
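For readers unfamiliar with the Kullback-Leibler divergence mentioned in this abstract, the following minimal sketch computes it for two discrete distributions. This is a simplification: the paper compares HMM-based models, whereas the dictionaries `p` and `q` here are hypothetical two-symbol distributions used only to show the formula and its asymmetry.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) for discrete distributions given as dicts over the same keys.

    Assumes q[k] > 0 wherever p[k] > 0; terms with p[k] == 0 contribute nothing.
    """
    return sum(pk * math.log(pk / q[k]) for k, pk in p.items() if pk > 0)

# Hypothetical symbol distributions for two "languages" (illustrative values only).
p = {"a": 0.5, "b": 0.5}
q = {"a": 0.9, "b": 0.1}

print(kl_divergence(p, q))  # divergence of p from q
print(kl_divergence(q, p))  # generally a different value: KL is not symmetric
```

Because the divergence is asymmetric, it is not a metric in the strict sense; a symmetrized variant such as the sum of both directions is one common workaround when a distance-like quantity is needed.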